{ "cells": [ { "cell_type": "markdown", "id": "c34f0dfb", "metadata": {}, "source": [ "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/niconoe/pyinaturalist/main?filepath=examples%2FData%2520Visualizations%2520-%2520Regional%2520Activity%2520Report.ipynb)" ] }, { "cell_type": "markdown", "id": "attached-priority", "metadata": {}, "source": [ "# Regional activity time series visualizations\n", "This example shows how to create visualizations of iNaturalist activity over time in a given region.\n", "See https://www.inaturalist.org/places to find place IDs.\n", "\n", "Visualizations are made using [Altair](https://altair-viz.github.io), with the following metrics:\n", "* Number of observations\n", "* Number of taxa observed\n", "* Number of observers\n", "* Number of identifiers" ] }, { "cell_type": "code", "execution_count": 1, "id": "prostate-overall", "metadata": {}, "outputs": [], "source": [ "from datetime import datetime\n", "from time import sleep\n", "\n", "from dateutil.relativedelta import relativedelta\n", "from IPython.display import Image\n", "from typing import Any, BinaryIO, Dict, Iterable, List, Optional, Tuple\n", "\n", "import altair as alt\n", "import pandas as pd\n", "\n", "from pyinaturalist import (\n", "    get_observations,\n", "    get_observation_histogram,\n", "    get_observation_species_counts,\n", "    get_observation_observers,\n", "    get_observation_identifiers,\n", ")\n", "from pyinaturalist.constants import ICONIC_TAXA\n", "from pyinaturalist.request_params import get_interval_ranges\n", "\n", "# Adjustable values\n", "PLACE_ID = 6\n", "PLACE_NAME = 'Alaska'\n", "YEAR = 2020\n", "\n", "THROTTLING_DELAY = 1.0  # Seconds to wait between consecutive requests" ] }, { "cell_type": "markdown", "id": "angry-longer", "metadata": {}, "source": [ "### Observations per year" ] }, { "cell_type": "code", "execution_count": 2, "id": "joint-interference", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "observations_by_year = get_observation_histogram(\n", "    place_id=PLACE_ID,\n", "    interval='year',\n", "    d1='2008-01-01',\n", "    d2=f'{YEAR}-12-31',\n", "    verifiable=True,\n", ")\n", "observations_by_year = pd.DataFrame([\n", "    {'date': k, 'observations': v}\n", "    for k, v in observations_by_year.items()\n", "])\n", "\n", "# Including the rendered image so the chart will display outside Jupyter, e.g. on GitHub's notebook viewer\n", "Image('images/observations_by_year.png')\n", "alt.Chart(observations_by_year).mark_bar().encode(x='year(date):T', y='observations:Q')" ] }, { "cell_type": "markdown", "id": "invisible-needle", "metadata": {}, "source": [ "### Observations per month" ] }, { "cell_type": "code", "execution_count": 3, "id": "dietary-tours", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "observations_by_month = get_observation_histogram(\n", "    place_id=PLACE_ID,\n", "    interval='month',\n", "    d1=f'{YEAR}-01-01',\n", "    d2=f'{YEAR}-12-31',\n", "    verifiable=True,\n", ")\n", "observations_by_month = pd.DataFrame([\n", "    {'metric': 'Observations', 'date': k, 'count': v}\n", "    for k, v in observations_by_month.items()\n", "])\n", "Image('images/observations_by_month.png')\n", "alt.Chart(observations_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "genetic-camping", "metadata": {}, "source": [ "### Histograms with custom metrics\n", "The API does not have a histogram endpoint for taxa observed, observers, or identifiers,\n", "so we first need to determine our date ranges of interest, and then run one search per date range.\n", "\n", "Here are a couple of helper functions to make this easier:" ] }, { "cell_type": "code", "execution_count": 4, "id": "duplicate-attribute", "metadata": {}, "outputs": [], "source": [ "def count_date_range_results(function, start_date, end_date):\n", "    \"\"\"Get the count of results for the given date range and search function\"\"\"\n", "    # Running this search with per_page=0 will (quickly) return only a count of results, not complete results\n", "    response = function(\n", "        place_id=PLACE_ID,\n", "        d1=start_date,\n", "        d2=end_date,\n", "        verifiable=True,\n", "        per_page=0,\n", "    )\n", "    print(f'Total results for {start_date.strftime(\"%b\")}: {response[\"total_results\"]}')\n", "    # Throttle before the next request; no wait is needed after the last month\n", "    if start_date.month != 12:\n", "        sleep(THROTTLING_DELAY)\n", "    return response['total_results']\n", "\n", "\n", "def get_monthly_counts(function, label):\n", "    \"\"\"Get the count of results per month for the given search function\"\"\"\n", "    month_ranges = get_interval_ranges(datetime(YEAR, 1, 1), datetime(YEAR, 12, 31), 'monthly')\n", "    counts_by_month = {\n", "        start_date: count_date_range_results(function, start_date, end_date)\n", "        for (start_date, end_date) in month_ranges\n", "    }\n", "    return pd.DataFrame([{'metric': label, 'date': k, 'count': v} for k, v in counts_by_month.items()])" ] }, { "cell_type": "markdown", "id": "induced-stone", "metadata": {}, "source": [ "### Unique taxa observed per month" ] }, { "cell_type": "code", "execution_count": 5, "id": "exempt-victor", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total results for Jan: 184\n", "Total results for Feb: 176\n", "Total results for Mar: 318\n", "Total results for Apr: 790\n", "Total results for May: 1334\n", "Total results for Jun: 1504\n", "Total results for Jul: 1684\n", "Total results for Aug: 1570\n", "Total results for Sep: 1250\n", "Total results for Oct: 639\n", "Total results for Nov: 408\n", "Total results for Dec: 550\n" ] }, { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "taxa_by_month = get_monthly_counts(get_observation_species_counts, 'Taxa')\n", "Image('images/taxa_by_month.png')\n", "alt.Chart(taxa_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "brief-daniel", "metadata": {}, "source": [ "### Observers per month" ] }, { "cell_type": "code", "execution_count": 6, "id": "generous-candy", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total results for Jan: 36\n", "Total results for Feb: 42\n", "Total results for Mar: 71\n", "Total results for Apr: 141\n", "Total results for May: 361\n", "Total results for Jun: 458\n", "Total results for Jul: 530\n", "Total results for Aug: 563\n", "Total results for Sep: 404\n", "Total results for Oct: 174\n", "Total results for Nov: 86\n", "Total results for Dec: 51\n" ] }, { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "observers_by_month = get_monthly_counts(get_observation_observers, 'Observers')\n", "Image('images/observers_by_month.png')\n", "alt.Chart(observers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "earlier-warren", "metadata": {}, "source": [ "### Identifiers per month" ] }, { "cell_type": "code", "execution_count": 7, "id": "parliamentary-edward", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Total results for Jan: 135\n", "Total results for Feb: 152\n", "Total results for Mar: 187\n", "Total results for Apr: 349\n", "Total results for May: 619\n", "Total results for Jun: 602\n", "Total results for Jul: 662\n", "Total results for Aug: 616\n", "Total results for Sep: 492\n", "Total results for Oct: 314\n", "Total results for Nov: 219\n", "Total results for Dec: 208\n" ] }, { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "identifiers_by_month = get_monthly_counts(get_observation_identifiers, 'Identifiers')\n", "Image('images/identifiers_by_month.png')\n", "alt.Chart(identifiers_by_month).mark_bar().encode(x='month(date):T', y='count:Q')" ] }, { "cell_type": "markdown", "id": "another-ambassador", "metadata": {}, "source": [ "### Combine all monthly metrics into one plot" ] }, { "cell_type": "code", "execution_count": 8, "id": "serious-literacy", "metadata": { "tags": [ "nbsphinx-thumbnail" ] }, "outputs": [ { "data": { "text/html": [ "\n", "
\n", "" ], "text/plain": [ "alt.Chart(...)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# DataFrame.append was removed in pandas 2.0; pd.concat works across pandas versions\n", "combined_results = pd.concat(\n", "    [observations_by_month, taxa_by_month, observers_by_month, identifiers_by_month]\n", ")\n", "\n", "Image('images/combined_activity_stats.png')\n", "alt.Chart(\n", "    combined_results,\n", "    title=f'iNaturalist activity in {PLACE_NAME} ({YEAR})',\n", "    width=750,\n", "    height=500,\n", ").mark_line().encode(\n", "    alt.X('month(date):T', axis=alt.Axis(title=\"Month\")),\n", "    alt.Y('count:Q', axis=alt.Axis(title=\"Count\")),\n", "    color='metric',\n", "    strokeDash='metric',\n", ").configure_axis(\n", "    labelFontSize=15,\n", "    titleFontSize=20,\n", ")" ] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.2" } }, "nbformat": 4, "nbformat_minor": 5 }